arXiv:1503.00102v1 [cs.SE] 28 Feb 2015


Context-Aware Reliability Prediction 
of Black-Box Services 


Jieming Zhu, Zibin Zheng, and Michael R. Lyu 

Department of Computer Science and Engineering 
The Chinese University of Hong Kong, Hong Kong 
{jmzhu, zbzheng, lyu}@cse.cuhk.edu.hk


Abstract—Reliability prediction is an important task in software
reliability engineering, which has been widely studied in the
last decades. However, modelling and predicting the user-perceived
reliability of black-box services remains an open research problem.
Software services, such as Web services and Web APIs, generally
provide black-box functionalities to users through the Internet,
which leaves little internal information available for reliability
analysis. Furthermore, the user-perceived service reliability
depends not only on the service itself, but also heavily on the
invocation context (e.g., service workloads, network conditions),
whereby traditional reliability models become ineffective and
inappropriate. To address these new challenges posed by black-box
services, in this paper, we propose CARP, a new context-aware
reliability prediction approach, which leverages historical
usage data from users to construct context-aware reliability
models and further provides online reliability prediction results
to users. Through context-aware reliability modelling, CARP is
able to alleviate the data sparsity problem that heavily limits the
prediction accuracy of other existing approaches. The preliminary
evaluation results show that CARP can significantly improve
reliability prediction accuracy, e.g., by 41% for
MAE and 38% for RMSE when only 5% of the data are available.

Index Terms—Black-box services; reliability prediction; context;
matrix factorization

I. Introduction 

Reliability measures the probability of failure-free software
operation for a specified period of time in a specified
environment 0. Reliability prediction is an important task
in software reliability engineering 0, 0, which can aid
in evaluating software design decisions for building reliable
software systems. In the last decades, reliability prediction
has been widely studied, producing a variety of prediction
models (e.g., the Palladio component model 0, the Poisson process
model 0, etc.). However, most of these existing models target
traditional white-box software systems, where the
reliabilities of system components are all known or can be
estimated through behaviour models built from internal information
of the components.

Nowadays, various software services such as Web services
and Web APIs are emerging over the Internet. These services
have become an integral part of building modern Web applications,
in which each service provides a black-box functionality
via some standard interfaces. To evaluate the reliability of a
(third-party) black-box service, traditional white-box reliability
prediction approaches become inapplicable due to a lack
of its internal behaviour information. In addition, unlike
stand-alone software systems, software services operate
over the Internet and likely serve users spread
worldwide. Therefore, the user-perceived reliability may differ
from user to user due to different user locations, and vary from
time to time due to dynamic service workloads and network
conditions. In such a setting, it is more suitable to evaluate
service reliability from the user side than from the system side,
as is done for traditional software systems. As a result, modelling
and predicting the user-perceived reliability of black-box services
remains an open research problem, which is exactly the goal
of our work.

Specifically, as with 0, (ED. we compute user-perceived 
service reliability as the ratio of the number of successful
service invocations to the total number of service invocations
performed by the user. The most straightforward way,
therefore, is to assess the reliability of a target service through
real invocations from users. However, each service usually
has many users and each user may need to assess a lot of
alternative services (with similar or identical functionalities).
Such exhaustive invocations can impose additional cost on
users (e.g., the service invocations may be charged) and also
incur expensive overhead for service systems (e.g., by consuming
additional system resources), thus making this infeasible
in practice. It is more desirable to identify approaches that
can achieve accurate reliability predictions without requiring
additional service invocations.

Towards this end, a few initial efforts have been made by 
several recent studies 0, ED, EE where they make use of 
historical usage data (i.e., observed reliability on the invoked 
services) collected from users for collaborative reliability 
prediction. Although these approaches achieve encouraging
results, two significant challenges remain: 1) Data sparsity.
Each user typically invokes only a few out of all the services,
thus the user-observed reliability data are extremely sparse in
practice. With limited training data, it is difficult to make accurate
reliability predictions. 2) Context modelling. The user-perceived
service reliability heavily depends on the invocation
context (e.g., service workloads, network conditions, etc.).
How to leverage such context information to aid in reliability
prediction is still a challenging problem.

In this paper, we present CARP, a context-aware reliability
prediction approach that aims to address the above challenges.
CARP is able to leverage the implicit context information
between users and services through a novel model of context-specific
matrix factorization. Evaluations are conducted based
on a publicly available dataset 0 with real-world reliability
samples collected from the Amazon EC2 cloud. The experimental
results demonstrate the effectiveness of CARP in enhancing
the reliability prediction accuracy.

Fig. 1. The Framework of Context-Aware Reliability Prediction

In summary, our paper makes the following contributions: 

• We study the problem of user-perceived reliability prediction
of black-box services, which remains an open and
challenging research problem.

• We present a context-aware reliability model together with its
construction for reliability prediction, which can alleviate
the data sparsity problem that heavily limits the prediction
accuracy of the existing approaches.

• The preliminary evaluation results show that CARP can 
make significant improvement on prediction accuracy, 
especially when the available data is sparse. 

The remainder of this paper is organized as follows. Section II
and Section III present the framework and the detailed
approach of CARP. Section IV reports on our preliminary
evaluation results. We discuss the related work in Section V
and finally conclude this paper in Section VI.

II. Framework 

Fig. 1 presents our context-aware reliability prediction
framework, which comprises three phases: 1) Data collection.
A user-collaboration mechanism, proposed in our previous
work 0, is applied to collect historical usage data from
users. Users can contribute their observed reliability data on
the invoked services and get back personalized (i.e., from the user
side) reliability prediction results. 2) Offline model construction.
Using the collected reliability data, we can construct
the context-aware reliability models by a set of procedures
including context identification, context-specific data aggregation,
and context-specific matrix factorization. The model
construction can be performed offline at periodic intervals
to update the models with newly-observed reliability data.
3) Online reliability prediction. The constructed reliability
models can provide personalized reliability prediction results
to users in an online manner.

III. Context-Aware Reliability Prediction 

In this section, we describe our context-aware reliability 
prediction approach in detail. 


A. A Context-Aware Reliability Model 

For traditional software systems, researchers generally
treat reliability as a constant value that measures the probability
of failure-free software operation. Given a specified period of
time, the software reliability is defined as follows:

r(s), (1)

where s denotes the specific software system. r(s) depends on 
the software-specific parameters such as software architecture, 
system resources (e.g., CPU, memory, and I/O), and other 
software design and implementation factors. 

However, this traditional definition of reliability is not
applicable for measuring the user-perceived reliability of black-box
services. As mentioned before, service reliability should
be evaluated from the user side rather than from the system side,
as is done for traditional software systems. Due to the influence
of user locations and network connections, different users may
experience quite different reliability even on the same service.
To characterize user-perceived reliability, Zheng et al. [10]
propose the following definition:

r(u,s), (2)

where u and s denote the specific user and service respectively.
r(u,s) depends on both user u and service s.

Further, this definition is extended to incorporate temporal
information in 0, since the user-perceived reliability may
vary from time to time due to fluctuating service workloads
and dynamic network conditions. Specifically,

r(u,s,t), (3)

formulates the user-perceived reliability for an invocation
inv(u,s,t) between user u and service s at time slice t.

Although this definition, r(u,s,t), naturally characterizes
the user-perceived reliability well, we find it difficult to
directly apply it to reliability prediction. Because each user
has limited historical usage data, applying r(u,s,t) to model
the data can lead to the data sparsity problem and thus result
in inaccurate predictions.

In this paper, we argue that the time-dimensional characteristics
can typically be captured by a finite set of context
conditions, each of which is a specific representation of the
underlying factors such as service workloads and network
conditions. This is further supported by the fact that service
workloads and network conditions likely have regular daily
distributions 0. Thus, a specific context condition determines
the reliability value at a specific time slice. Based on this
observation, we propose a context-aware reliability model:

r(u,s,c), (4) 

where c denotes the specific context condition under which the
invocations inv(u,s,t) are performed. r(u,s,c) indicates that
the user-perceived reliability depends on user u, service s, and
the context c. In particular, r(u,s,t) ≈ r(u,s,c) if the context
condition is c at time slice t. In the following, we describe
the use of this model for context-aware reliability prediction.

B. Offline Model Construction 

Formally, we can collect a 3-dimensional matrix
$R \in \mathbb{R}^{M \times N \times T}$, which records the reliability data for M users,
N services, and T time slices. $R_{u,s,t} = r(u,s,t)$ when
the reliability value r(u,s,t) of invocations inv(u,s,t) is
observed; otherwise, we set $R_{u,s,t} = 0$ to mark an unknown
entry. Due to the aforementioned data sparsity problem, the
matrix R is highly sparse in practice with a large number of
unknown entries. The goal of reliability prediction is to predict
these unknown entries, whereby the reliability of an ongoing
invocation can be further predicted. As illustrated in the right
panel in Fig. 1, the offline model construction comprises three
steps: context identification, context-specific data aggregation,
and context-specific matrix factorization.

1) Context Identification: To characterize and identify
different context conditions, we employ the k-means clustering
technique to cluster the reliability data R with T time slices
into C clusters, where each cluster represents a specific context
and different time slices grouped into one cluster belong to
the same context. To achieve this, the observed reliability data
between M users and N services at each time slice t can
be constructed as a feature vector for k-means clustering.
However, due to the sparse nature of R, the feature vectors
would become high-dimensional and sparse, further leading to
bad clustering performance. To overcome this issue, we define
a feature vector $f_t$ for time slice t using the average reliability
value of each service:

$f_t = (\bar{r}(s_1,t), \bar{r}(s_2,t), \cdots, \bar{r}(s_N,t))$, (5)

where $\bar{r}(s,t) = \mathrm{mean}(\{R_{u,s,t} \mid R_{u,s,t} > 0,\ 1 \le u \le M\})$
calculates the average reliability value of service s over the
observed entries at time slice t. Using these feature vectors, we
perform data clustering and get C different context conditions.
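The context identification step can be illustrated with a minimal sketch. It assumes a toy tensor stored as nested lists `R[t][u][s]` with 0 marking unknown entries, and uses a plain Lloyd's k-means written from scratch; the function names, the toy data, and the choice to give an unobserved service the value 0.0 are all illustrative assumptions, not the authors' released implementation.

```python
import random

def feature_vector(R_t):
    """Average each service column of one time slice over observed (> 0) entries."""
    feats = []
    for s in range(len(R_t[0])):
        observed = [row[s] for row in R_t if row[s] > 0]
        # Assumption: a service with no observation in this slice gets 0.0.
        feats.append(sum(observed) / len(observed) if observed else 0.0)
    return feats

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy data: R[t][u][s] with T = 4 time slices, M = 2 users, N = 2 services.
R = [
    [[0.9, 0.0], [0.0, 0.8]],  # t = 0, lightly loaded
    [[0.2, 0.3], [0.0, 0.0]],  # t = 1, heavily loaded
    [[0.0, 0.9], [1.0, 0.0]],  # t = 2, lightly loaded
    [[0.0, 0.2], [0.3, 0.0]],  # t = 3, heavily loaded
]
features = [feature_vector(R_t) for R_t in R]
labels, centroids = kmeans(features, k=2)
print(labels)  # slices 0 and 2 share one context, slices 1 and 3 the other
```

Each feature vector has only N dimensions regardless of the number of users, which is what keeps the clustering input dense even when individual user-service entries are mostly unknown.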

2) Context-Specific Data Aggregation: Multiple time
slices may be clustered into the same context. To alleviate the
data sparsity problem, we propose to aggregate the data of
different time slices within the same context. An aggregated
data matrix $\bar{R} \in \mathbb{R}^{M \times N \times C}$ can thus be obtained, where each
entry $\bar{R}_{u,s,c}$ denotes the average reliability value between user
u and service s in context c:

$\bar{R}_{u,s,c} = \mathrm{mean}(\{R_{u,s,t} \mid R_{u,s,t} > 0,\ t \in \text{context } c\})$ (6)

In particular, $\bar{R}_{u,s,c} = 0$ indicates that the reliability for
invocations inv(u,s,t) performed in context c is unknown. For
example, in Fig. 1, the observed reliability data of four time
slices are aggregated into two contexts (i.e., contexts $c_1$ and
$c_2$) and thus the aggregated data become much denser.
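The aggregation in Equation (6) can be sketched in a few lines. The sketch below assumes the same nested-list layout `R[t][u][s]` with 0 for unknown entries and a list `labels` giving the context of each time slice; the function name and the toy data are hypothetical.

```python
def aggregate_by_context(R, labels, num_contexts):
    """Return agg[c][u][s]: mean of observed (> 0) values over the slices of context c.

    Entries with no observation in any slice of the context stay 0 (unknown).
    """
    M, N = len(R[0]), len(R[0][0])
    agg = [[[0.0] * N for _ in range(M)] for _ in range(num_contexts)]
    for c in range(num_contexts):
        slices = [R[t] for t in range(len(R)) if labels[t] == c]
        for u in range(M):
            for s in range(N):
                observed = [R_t[u][s] for R_t in slices if R_t[u][s] > 0]
                if observed:
                    agg[c][u][s] = sum(observed) / len(observed)
    return agg

# Toy data: four time slices, two users, two services; slices 0 and 2 belong
# to context 0, slices 1 and 3 to context 1 (labels assumed to be given).
R = [
    [[0.9, 0.0], [0.0, 0.8]],
    [[0.2, 0.3], [0.0, 0.0]],
    [[0.0, 0.9], [1.0, 0.0]],
    [[0.0, 0.2], [0.3, 0.0]],
]
labels = [0, 1, 0, 1]
agg = aggregate_by_context(R, labels, num_contexts=2)
print(agg[0])  # every entry of context 0 is now observed
print(agg[1])  # one entry of context 1 is still unknown (0.0)
```

In this toy case each slice has only two of four entries observed, while the aggregated matrix for context 0 is fully observed, which is the densification effect described above. Note that the 0-means-unknown convention cannot distinguish a truly observed reliability of 0 from a missing entry.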

3) Context-Specific Matrix Factorization: The problem of
context-aware reliability prediction is to predict the unknown
entries (where $\bar{R}_{u,s,c} = 0$) of the aggregated data $\bar{R}$. This
can be modelled as a collaborative filtering (CF) problem (as
with ED), which aims to recover the full matrix from a
small number of observed entries. Taking Fig. 1 as an example,
in the aggregated matrix for context $c_1$, we have four entries
observed (e.g., an entry for service $s_1$ with value 0.4) and five
unknown entries to predict (e.g., another entry for service $s_1$).
Matrix factorization (MF) 0 is a classic
CF model that allows for low-rank matrix approximation.
Different from conventional matrix factorization, we have a
3-dimensional reliability data matrix $\bar{R} \in \mathbb{R}^{M \times N \times C}$, including
one 2-dimensional M-by-N data matrix $R^{(c)}$ in each context
c (1 ≤ c ≤ C), where its entry $R^{(c)}_{us} = \bar{R}_{u,s,c}$.

In such a setting, we propose context-specific matrix factorization.
Formally, factorizing a data matrix $R^{(c)} \in \mathbb{R}^{M \times N}$ is to
map both users and services into a d-dimensional latent factor
space, such that the values of $R^{(c)}$ can be captured as the inner
products of the corresponding latent factors $U^{(c)} \in \mathbb{R}^{d \times M}$ and
$S^{(c)} \in \mathbb{R}^{d \times N}$, i.e., $R^{(c)} \approx U^{(c)T} S^{(c)}$, where $U^{(c)T}$ is the
transpose of $U^{(c)}$. Therefore, the context-specific MF model
for context c is to minimize the following loss function:

$\mathcal{L}^{(c)} = \frac{1}{2} \sum_{u,s} I^{(c)}_{us} \left( R^{(c)}_{us} - U^{(c)T}_u S^{(c)}_s \right)^2 + \frac{\lambda}{2} \left( \|U^{(c)}\|_F^2 + \|S^{(c)}\|_F^2 \right)$, (7)

where the first term measures the sum of the squared errors
between the observed value $R^{(c)}_{us}$ and the estimated value
$U^{(c)T}_u S^{(c)}_s$, and the second is a regularization term used to
avoid the overfitting problem 0. $I^{(c)}_{us}$ acts as an indicator:
$I^{(c)}_{us} = 1$ if $R^{(c)}_{us} > 0$; $I^{(c)}_{us} = 0$, otherwise. $\|\cdot\|_F$ denotes the
Frobenius norm 0, and $\lambda$ is a parameter to control the extent
of regularization.

The gradient descent algorithm 0 is usually employed to
solve the MF model in Equation (7) (see details in Appendix B).
For ease of computation, we solve each context-specific MF
model sequentially, and employ the solution of the previous context
to initialize the current one (e.g., use $U^{(1)}$ and $S^{(1)}$
to initialize $U^{(2)}$ and $S^{(2)}$). Finally, we obtain a pair
of $U^{(c)}$ and $S^{(c)}$ for each context c. In practice, the offline
model construction can be performed periodically to update
the models with newly-observed reliability data.
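The sequential, warm-started fitting described above can be sketched as follows. This is a minimal pure-Python illustration, assuming batch gradient descent on a regularized squared-error loss over observed (> 0) entries; the learning rate, iteration count, toy matrices, and all function names are assumptions made for the example, and latent factors are stored row-wise (one d-vector per user or service) rather than as d-by-M matrices.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def factorize(R, d=2, lam=0.01, lr=0.05, iters=3000, init=None, seed=0):
    """Fit latent factors to one M x N matrix (0 = unknown) by batch gradient descent.

    `init` is an optional (U, S) pair used as a warm start.
    """
    M, N = len(R), len(R[0])
    if init is None:
        rng = random.Random(seed)
        U = [[rng.random() for _ in range(d)] for _ in range(M)]
        S = [[rng.random() for _ in range(d)] for _ in range(N)]
    else:
        U = [vec[:] for vec in init[0]]  # copy so the previous model is untouched
        S = [vec[:] for vec in init[1]]
    for _ in range(iters):
        # Gradients start from the regularization term.
        gU = [[lam * x for x in vec] for vec in U]
        gS = [[lam * x for x in vec] for vec in S]
        for u in range(M):
            for s in range(N):
                if R[u][s] > 0:
                    err = dot(U[u], S[s]) - R[u][s]
                    for k in range(d):
                        gU[u][k] += err * S[s][k]
                        gS[s][k] += err * U[u][k]
        for u in range(M):
            for k in range(d):
                U[u][k] -= lr * gU[u][k]
        for s in range(N):
            for k in range(d):
                S[s][k] -= lr * gS[s][k]
    return U, S

def fit_contexts(agg, **kwargs):
    """Solve the contexts one after another, warm-starting from the previous solution."""
    models, prev = [], None
    for R_c in agg:
        prev = factorize(R_c, init=prev, **kwargs)
        models.append(prev)
    return models

def predict(U, S, u, s):
    return dot(U[u], S[s])

# Toy aggregated data: two contexts, three users, three services, 0 = unknown.
agg = [
    [[0.90, 0.70, 0.00], [0.72, 0.00, 0.40], [0.00, 0.42, 0.30]],
    [[0.45, 0.00, 0.25], [0.00, 0.28, 0.20], [0.27, 0.21, 0.00]],
]
models = fit_contexts(agg)
U0, S0 = models[0]
print(round(predict(U0, S0, 0, 2), 2))  # prediction for an unknown entry
```

The warm start only changes where the optimization begins for each context; whether it also improves accuracy, as opposed to convergence speed, is not something this sketch demonstrates.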

C. Online Reliability Prediction 

The constructed models (i.e., $U^{(c)}$ and $S^{(c)}$) allow for reliability
prediction for invocations performed between user u and
service s in context c, i.e., $\hat{R}_{u,s,c} = U^{(c)T}_u S^{(c)}_s$, where $\hat{R}$
denotes the predicted matrix of the aggregated data. This is the basis for
performing online reliability prediction, which aims to predict the
user-perceived reliability of an ongoing invocation $inv(u,s,t_c)$.
Therefore, we seek to associate the invocation context at the
current time slice $t_c$ with an existing context c. In our
implementation, we use the newly observed reliability data to help
identify the current context. Specifically, given the observed
feature vector $f_{t_c} = (\bar{r}(s_1,t_c), \bar{r}(s_2,t_c), \cdots, \bar{r}(s_N,t_c))$, we
group it into one of the existing context clusters. After
obtaining the context c, the reliability of $inv(u,s,t_c)$, denoted
as $\hat{r}(u,s,t_c)$, can be predicted by $\hat{r}(u,s,t_c) = \hat{R}_{u,s,c}$.

Statistics            Values
#Records              17,150
#Users                50
#Services             49
#Workloads            7
Reliability range     0 ~ 1
Reliability average   0.433

Fig. 2. Data Statistics

Fig. 3. Data Distribution
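The online step can be sketched as a nearest-centroid lookup followed by an inner product. The centroids and latent factors below are hand-written placeholder values, and assigning the new feature vector to the closest centroid by squared Euclidean distance is an assumption about how "grouping into an existing cluster" might be done.

```python
def nearest_context(feature, centroids):
    """Index of the centroid closest to the feature vector (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(centroids)), key=lambda c: sq_dist(feature, centroids[c]))

def predict_online(feature, centroids, models, u, s):
    """Pick the context for the current time slice, then read off the model prediction."""
    c = nearest_context(feature, centroids)
    U, S = models[c]  # per-context latent factors, one vector per user / service
    return c, sum(a * b for a, b in zip(U[u], S[s]))

# Hypothetical offline results: two contexts, two users, two services, d = 1.
centroids = [[0.90, 0.85], [0.25, 0.25]]
models = [
    ([[1.0], [0.9]], [[0.9], [0.8]]),  # context 0: (U, S)
    ([[0.5], [0.6]], [[0.5], [0.4]]),  # context 1: (U, S)
]

# Newly observed per-service averages at the current time slice.
f_tc = [0.30, 0.20]
context, r_hat = predict_online(f_tc, centroids, models, u=1, s=0)
print(context, round(r_hat, 2))
```

Because only a lookup and a d-dimensional inner product are needed at prediction time, the expensive clustering and factorization can stay in the offline phase.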

IV. Preliminary Results 

In this section, we present our preliminary results on evaluating
the effectiveness of CARP. For ease of reproducing our
approach, we release our source code with detailed experimental
results on our project page¹.

A. Data Description 

Our experiments are conducted based on a real-world reliability
dataset recently released in 0. The dataset was collected
using the Amazon EC2 platform and contains 17,150 reliability
records from about 2.5 million invocations between 50 users
and 49 services under 7 different workloads. Figs. 2 and 3
present some data statistics and the data distribution. Specifically,
the services are implemented as matrix multiplication
operations with different computational complexities, while
the users are simulated by a "stress testing" tool, loadUI. Both
users and services are deployed in different locations across
the seven EC2 regions. The service workload is controlled
by setting different time intervals (i.e., 3~9 sec) between
consecutive invocations. Each reliability value is calculated
as the success ratio of 150 consecutive service invocations.

B. Evaluation Metrics 

To evaluate the prediction accuracy, we use two standard
error metrics, MAE (Mean Absolute Error) and RMSE (Root
Mean Square Error):

$\mathrm{MAE} = \sum_{inv(u,s,t)} |\hat{r}(u,s,t) - r(u,s,t)| / N$, (8)

$\mathrm{RMSE} = \sqrt{\sum_{inv(u,s,t)} (\hat{r}(u,s,t) - r(u,s,t))^2 / N}$, (9)

where r(u,s,t) and $\hat{r}(u,s,t)$ denote the observed reliability
value and the corresponding predicted value respectively, for
an invocation inv(u,s,t). N is the total number of testing
samples to be predicted. Both metrics measure the average
magnitude of the errors and smaller values indicate better
prediction accuracy. Compared to MAE, RMSE gives relatively
high weights to large errors and is more suitable when
large errors are particularly undesirable. These two metrics
have also been adopted by the existing work 0, [10].
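A small worked example may help show how the two metrics differ. The sketch below implements Equations (8) and (9) directly; the two prediction lists are made-up values chosen so that both have the same MAE while one contains a single large error.

```python
import math

def mae(observed, predicted):
    """Mean absolute error over paired samples."""
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root mean square error over paired samples."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / len(observed))

observed = [0.5, 0.5, 0.5, 0.5]
one_large_error = [0.9, 0.5, 0.5, 0.5]      # a single error of 0.4
many_small_errors = [0.6, 0.6, 0.6, 0.6]    # four errors of 0.1

print(mae(observed, one_large_error), mae(observed, many_small_errors))
print(rmse(observed, one_large_error), rmse(observed, many_small_errors))
```

Both prediction lists have an MAE of about 0.1, but the RMSE is about 0.2 for the list with one large error and about 0.1 for the list with evenly spread errors, which is the weighting effect described above.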

1 http://wsdream.github.io/CARP


C. Accuracy Results 

We compare CARP with the following state-of-the-art approaches
that have been recently proposed for reliability
prediction of Web services.

• Baseline: This is a baseline approach that simply uses
the overall average value of the observed reliability data
for prediction.

• Hybrid [10]: This approach models reliability prediction
as a collaborative filtering (CF) problem, which is solved
by combining two traditional CF approaches: user-based
approach (UPCC) and item-based approach (IPCC).

• CLUS [6]: Based on K-means clustering, this approach
clusters historical reliability data according to user-specific,
service-specific, and environment-specific parameters
respectively, and hashes the average reliability
value of each cluster for prediction.

• PMF [11]: This is a widely-used matrix factorization
model, where the reliability data are modelled by a
predefined low-rank matrix model.

As we mentioned before, the observed reliability data are
sparse in practice, because each user usually invokes only a
small subset of all the services. To simulate the data
sparsity in our experiments, we randomly remove entries
from the data matrix R in our dataset, so that each user only
keeps a few available historical reliability records. We use the
remaining data for model construction and the removed values
for accuracy evaluation. Specifically, we vary the data density
from 5% to 25% at a step increase of 5%. Data density = 5%,
for example, indicates that each user invokes only 5% of the
services, and each service is invoked by 5% of the users. For
Hybrid and CLUS, we employ the executable program with its
parameters provided in 0. For PMF, we carefully tune the
parameters and set d = 2 and λ = 0.01, which give the best accuracy
results. To make CARP consistent with other approaches, we
set the number of context conditions C = 7 as with CLUS,
and set d = 2, λ = 0.01 as with PMF. Each experiment is run
20 times and the average results are reported.
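The sparsity simulation can be sketched as a random train/test split at a given density. The sketch below samples uniformly over all observed entries of a 2-dimensional matrix, which is a simplifying assumption (the text describes the density per user and per service, and the dataset has a third dimension); the function name and the toy matrix are hypothetical.

```python
import random

def split_by_density(R, density, seed=0):
    """Keep a random `density` fraction of observed entries for training.

    Returns (train, test): `train` is a copy of R with removed entries set to 0,
    `test` is a list of (u, s, value) tuples for the removed entries.
    """
    rng = random.Random(seed)
    observed = [(u, s) for u, row in enumerate(R) for s, v in enumerate(row) if v > 0]
    rng.shuffle(observed)
    n_train = round(density * len(observed))
    train = [[0.0] * len(row) for row in R]
    for u, s in observed[:n_train]:
        train[u][s] = R[u][s]
    test = [(u, s, R[u][s]) for u, s in observed[n_train:]]
    return train, test

# Toy dense matrix: 20 users x 10 services, all entries observed.
R = [[0.5 + 0.01 * ((u + s) % 10) for s in range(10)] for u in range(20)]
train, test = split_by_density(R, density=0.05)
kept = sum(v > 0 for row in train for v in row)
print(kept, len(test))  # 10 entries kept for training, 190 held out
```

Repeating the split with different seeds and averaging the resulting errors would mirror the repeated-run protocol described above.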

Table I provides the prediction accuracy results of different
approaches in terms of MAE and RMSE. We can see that
CARP consistently outperforms the other approaches with
smaller MAE and RMSE. Compared to the most competitive
results of PMF, CARP still has 13.2%~41.0% improvement on
MAE and 6.0%~38.9% improvement on RMSE. It indicates
that our CARP model fits the reliability data better and works
well for context-aware reliability prediction. In particular,
larger improvement can be achieved at smaller data density
(e.g., the largest improvement is achieved at data density
= 5%), which demonstrates the effectiveness of CARP in
alleviating the data sparsity problem for reliability prediction.

TABLE I
Accuracy Comparison

                       Data Density
Metric  Approach       5%      10%     15%     20%     25%
MAE     Baseline       0.176   0.170   0.168   0.169   0.170
        Hybrid [10]    0.152   0.084   0.079   0.073   0.073
        CLUS [6]       0.077   0.059   0.043   0.036   0.031
        PMF [11]       0.076   0.031   0.021   0.017   0.014
        CARP           0.045   0.022   0.017   0.014   0.013
        Impr.(%)       41.0%   27.8%   20.3%   13.9%   13.2%
RMSE    Baseline       0.217   0.211   0.210   0.210   0.211
        Hybrid [10]    0.204   0.109   0.102   0.094   0.094
        CLUS [6]       0.112   0.093   0.066   0.060   0.052
        PMF [11]       0.110   0.050   0.036   0.031   0.028
        CARP           0.067   0.037   0.031   0.029   0.027
        Impr.(%)       38.9%   24.9%   14.8%   5.8%    6.0%

(a) MAE (b) RMSE

Fig. 4. Impact of Data Sparsity

D. Impact of Data Sparsity 

To study the impact of data sparsity on prediction accuracy,
we evaluate CARP by varying the data density from 5%
to 50% at a step increase of 5%. A lower data density
indicates a higher data sparsity because more data are removed
during data processing. Fig. 4 presents the evaluation results
(average with 95% confidence interval) on both MAE and
RMSE. We can observe that better prediction accuracy is
achieved as the data density increases from 5% to 50%:
MAE decreases from 0.045 to 0.009 and RMSE decreases
from 0.067 to 0.022. The results show that more training
data can usually provide more useful information for model
construction and thus achieve better prediction accuracy. In
particular, the significant fluctuation of the curve when the
data are extremely sparse (e.g., data density = 5%) further
confirms that data sparsity is a great challenge in achieving
accurate reliability prediction. Our CARP approach takes a
first step towards addressing the data sparsity challenge
and achieves significant improvement.

V. Related Work 

Reliability prediction is an important research issue in
software reliability engineering that has been widely studied
in the last decades. Current research, however, mostly
targets traditional white-box software systems
with additional information regarding the behaviours of the
internal components £9- For modern software services such 
as Web services and Web APIs, they typically provide black-box
functionalities to service users, thus leading to a lack of
internal behaviour information of the services (except some
API documents). In addition, software services usually provide
remote service access through the Internet. The user-perceived
reliability, therefore, not only depends on the service itself but
also relies on the network connections between users and services.
These new challenges make existing prediction models unsuitable for
reliability prediction of black-box services. In this paper, we
propose to address the problem of user-perceived reliability prediction
of black-box services. We present a novel approach that can
exploit historical usage data from users for context identification
of service invocations and can further leverage them for
context-aware reliability prediction.

Current research has seldom focused on user-perceived
reliability prediction of software services. Zheng et al. [10]
make the first effort in this direction, where they only employ
historical usage data from users for reliability prediction and
model it as a collaborative filtering problem. Collaborative
filtering (CF) Q is a well-studied technique for rating prediction
in recommender systems, which consists of two types
of approaches: neighbourhood-based approaches and model-based
approaches. In [10], they propose a neighbourhood-based
approach, Hybrid, which combines two traditional
CF approaches: user-based CF (UPCC) and item-based CF
(IPCC). The following work [11] further extends a model-based
approach, matrix factorization (PMF), to address this
problem. However, these models only consider user-specific
and service-specific parameters. Silic et al. [6] make a further
step forward and incorporate environment-specific parameters
for reliability prediction. This approach achieves scalability by
clustering reliability data according to user-specific, service-specific,
and environment-specific parameters, but sacrifices
prediction accuracy (which is worse than PMF as shown in
Table I). Our approach, instead, addresses these limitations on
accuracy and scalability by performing context-aware reliability
prediction.

VI. Conclusion and Future Work 

This paper presents CARP, a context-aware reliability prediction
approach for user-perceived reliability prediction of
black-box services. CARP exploits historical usage data from
users to assess the observed reliability of services, and further
leverages them to construct context-aware reliability models.
Through context-aware model training and prediction, CARP
is capable of alleviating the data sparsity problem that heavily
limits the existing approaches. The preliminary experimental
results show that CARP makes a significant improvement in
prediction accuracy over the state-of-the-art approaches.

The use of data-driven approaches is promising for the 
quality management of black-box services in the field. We 
believe CARP can serve as a good starting point towards 
this end. As part of our future work, we plan to: 1) develop 
more robust reliability prediction approaches to handle the data 
collection from malicious users and services, 2) consider data
privacy when performing collaborative reliability prediction, 
and 3) perform reliability evaluations for real-world services 
to help identify and address reliability issues. 
[11] Z. Zheng and M. R. Lyu. Personalized reliability prediction of web

Appendix

This section provides some background on the matrix factorization
model and the gradient descent algorithm, based on
which we develop our CARP approach.

A. Matrix Factorization

Matrix factorization (e.g., PMF 0) is a classic model to
address the collaborative filtering problem, which constrains
the rank of the QoS matrix, i.e., rank(R) = d. The low-rank
assumption is based on the fact that the entries of R are largely
correlated, thereby resulting in a low effective rank in R. For
instance, close users may have similar network conditions, and
thus experience similar QoS on the same service. Concretely,
factoring a matrix is to map both users and services into a
joint latent factor space of a low dimensionality d such that
values of the user-service QoS matrix can be captured as
inner products of latent factors in that space. Then the latent
factors can be employed for further prediction on unknown
QoS values.

Formally, latent user factors are denoted as $U \in \mathbb{R}^{d \times n}$
and latent service factors as $S \in \mathbb{R}^{d \times m}$, which are used to
fit the QoS matrix R, i.e., $R \approx U^T S$. To avoid overfitting,
regularization terms that penalize the norms of the solutions
(i.e., U and S) are added. Thus we aim to minimize the
following loss function:

$\mathcal{L} = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{m} I_{ij} (R_{ij} - U_i^T S_j)^2 + \frac{\lambda_u}{2} \|U\|_F^2 + \frac{\lambda_s}{2} \|S\|_F^2$, (10)

where $I_{ij}$ acts as an indicator that equals 1 if $R_{ij}$
is observed, and 0 otherwise. $\|\cdot\|_F$ denotes the Frobenius
norm 0, and $\lambda_u$, $\lambda_s$ are two parameters to control the extent
of regularization.

B. Gradient Descent

Gradient descent is a widely used method to find a local
minimum of an objective function in an iterative way. As
for the PMF model expressed in Equation (10), the gradient
descent algorithm works by updating $U_i$ and $S_j$ simultaneously from
random initialization using the following updating rules:

$U_i \leftarrow U_i - \eta \frac{\partial \mathcal{L}}{\partial U_i}$, (11)

$S_j \leftarrow S_j - \eta \frac{\partial \mathcal{L}}{\partial S_j}$, (12)

where the derivatives with respect to $U_i$ and $S_j$ can be obtained from
Equation (10) as follows:

$\frac{\partial \mathcal{L}}{\partial U_i} = \sum_{j=1}^{m} I_{ij} (U_i^T S_j - R_{ij}) S_j + \lambda_u U_i$, (13)

$\frac{\partial \mathcal{L}}{\partial S_j} = \sum_{i=1}^{n} I_{ij} (U_i^T S_j - R_{ij}) U_i + \lambda_s S_j$. (14)

Hence, we derive the following updating rules:

$U_i \leftarrow U_i - \eta \left( \sum_{j=1}^{m} I_{ij} (U_i^T S_j - R_{ij}) S_j + \lambda_u U_i \right)$, (15)

$S_j \leftarrow S_j - \eta \left( \sum_{i=1}^{n} I_{ij} (U_i^T S_j - R_{ij}) U_i + \lambda_s S_j \right)$. (16)

In this way, the latent factors $U_i$ and $S_j$ move iteratively by a
small step along the gradients, i.e., $\frac{\partial \mathcal{L}}{\partial U_i}$ and $\frac{\partial \mathcal{L}}{\partial S_j}$, where
the step size is controlled by a learning rate $\eta$. Such iterations
continue until convergence. The detailed
description of the algorithm is provided in Algorithm 1.

Algorithm 1: Gradient Descent for MF

Input: The collected QoS matrix R, the indication matrix I,
       and the model parameters: $\eta$, $\lambda_u$ and $\lambda_s$.
       /* $I_{ij} = 1$ if $R_{ij}$ is known; otherwise, $I_{ij} = 0$ */
Output: The QoS prediction results: $\hat{R}_{ij}$, where $I_{ij} = 0$.

Initialize $U \in \mathbb{R}^{d \times n}$ and $S \in \mathbb{R}^{d \times m}$ randomly;
repeat /* Batch-mode updating */
    foreach (i, j) do /* Compute $\frac{\partial \mathcal{L}}{\partial U_i}$ and $\frac{\partial \mathcal{L}}{\partial S_j}$ */
        $\frac{\partial \mathcal{L}}{\partial U_i} \leftarrow \sum_{j=1}^{m} I_{ij} (U_i^T S_j - R_{ij}) S_j + \lambda_u U_i$;
        $\frac{\partial \mathcal{L}}{\partial S_j} \leftarrow \sum_{i=1}^{n} I_{ij} (U_i^T S_j - R_{ij}) U_i + \lambda_s S_j$;
    end
    foreach (i, j) do /* Update each $U_i$ and $S_j$ */
        $U_i \leftarrow U_i - \eta \frac{\partial \mathcal{L}}{\partial U_i}$;
        $S_j \leftarrow S_j - \eta \frac{\partial \mathcal{L}}{\partial S_j}$;
    end
until converge;
foreach (i, j) $\in \{I_{ij} = 0\}$ do /* Make prediction */
    $\hat{R}_{ij} = U_i^T S_j$;
end








